Method and encoder for bit-rate saving encoding of audio signals

The invention relates to a method and an encoder for bit- 
rate saving encoding of audio signals, especially for 
encoding of audio signals according to MPEG 1 Audio Layer 
II. 

Prior art 

The MPEG 1 Audio standard as specified in ISO/IEC 11172-3 
defines three operational modes known as Layers I, II and 
III. Each Layer offers increased compression but also 
increased encoding complexity, whereby downward 
compatibility is guaranteed. Therefore, a Layer II decoder
can also read a Layer I data stream but not a Layer III data
stream. Likewise, a Layer III decoder can decode all MPEG 1
Audio bit streams, i.e. Layers I to III.

MPEG 1 Audio data compression is based on subband coding.
The audio signal is split into 32 subbands of equal width.
A quantization is performed using a psychoacoustic model
that is adapted to the masking behaviour of the human
hearing. Each subband signal is quantized in such a way
that the quantization noise introduced by the coding will
not exceed the masking curve for that subband. After
quantization, the samples form, together with the scale
factors and further coding information, a frame structure
for transmission.
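The width of each of the 32 equal subbands follows directly from the sampling frequency. The short Python sketch below is an illustration only; it assumes a critically sampled filter bank covering the range from 0 to half the sampling frequency, and the function name subband_width_hz is hypothetical.

```python
NUM_SUBBANDS = 32  # number of equal-width subbands named in the text


def subband_width_hz(sampling_rate_hz):
    """Width of one subband when 0 .. fs/2 is divided into 32 equal parts."""
    return (sampling_rate_hz / 2) / NUM_SUBBANDS


# MPEG 1 Audio sampling frequencies in Hz.
for fs in (32000, 44100, 48000):
    print(fs, subband_width_hz(fs))
```

At 48 kHz, for example, each subband is 750 Hz wide.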

In ISO/IEC 11172-3 two independent psychoacoustic models 
are defined which can be adapted to any Layer. The output 
from these psychoacoustic models is a set of Signal-to-
Masking Ratios, SMR_n, for every subband n.




In order to calculate the SMR_n for the psychoacoustic
model 2, a Fast Fourier Transformation (FFT) with a length
of 1024 samples is used, which has to run twice per
channel, i.e. four times for a stereo signal.

Invention 

The invention is based on the object of specifying a
method for bit-rate saving encoding of audio signals using
a psychoacoustic model with reduced computing power.
This object is achieved by means of the method specified
in claim 1.

It is a further object of the invention to disclose an 
encoder which utilises the inventive method. This object 
is achieved by the apparatus disclosed in claim 6. 

The invention is based upon the following realization. On
the one hand, an FFT is a special discrete Fourier
Transformation for which the number of samples has to be a
power of two, e.g. 1024 samples. On the other hand, the
frame length of MPEG 1 Audio Layer II is 1152 samples,
which is not a power of two. This results in the two
runs per channel of the FFT according to the prior art.

The known formula for a discrete Fourier Transformation,
which calculates for L elements of a data series z(n) the
corresponding L frequency values F(m),

$$F(m) = \sum_{n=0}^{L-1} z(n)\, e^{-j 2\pi \frac{n m}{L}}, \qquad m = 0, \ldots, L-1,$$

can be transformed first into k partial sums with M
summands each

$$F(m) = \sum_{i=0}^{k-1} \sum_{r=0}^{M-1} z(kr+i)\, e^{-j 2\pi \frac{(kr+i)\, m}{L}}$$

and finally, with L=kM and splitting the exponential
function, into

$$F(m) = \sum_{i=0}^{k-1} e^{-j 2\pi \frac{i m}{L}} \sum_{r=0}^{M-1} z(kr+i)\, e^{-j 2\pi \frac{r m}{M}}.$$

Therefore, using k subtransformations with a length of
M=2^N allows time-efficient calculation with Fast Fourier
Transformations, even if L is not a power of two.
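
The decomposition described above can be checked numerically. The following Python sketch is an illustration only, not the encoder's implementation. It assumes that subtransformation i receives every k-th sample starting at offset i, that the phase correction factor for subtransformation i and frequency index m is exp(-j*2*pi*i*m/L), and that each length-M result repeats with period M. The function names fft_radix2, dft_composite and dft_direct are hypothetical.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for m in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * m / n) * odd[m]
        out[m] = even[m] + t
        out[m + n // 2] = even[m] - t
    return out


def dft_composite(z, k):
    """Length-L DFT of z assembled from k subtransformations of length M = L / k."""
    L = len(z)
    M, rem = divmod(L, k)
    if rem or M < 1 or M & (M - 1):
        raise ValueError("len(z) must equal k * 2**N")
    # Assumption: subtransformation i sees the samples z[i], z[i+k], z[i+2k], ...
    subs = [fft_radix2(z[i::k]) for i in range(k)]
    F = []
    for m in range(L):
        acc = 0j
        for i in range(k):
            phase = cmath.exp(-2j * cmath.pi * i * m / L)  # phase correction factor
            acc += phase * subs[i][m % M]  # length-M result is periodic in m
        F.append(acc)
    return F


def dft_direct(z):
    """Reference DFT evaluated straight from the defining sum."""
    L = len(z)
    return [
        sum(z[n] * cmath.exp(-2j * cmath.pi * n * m / L) for n in range(L))
        for m in range(L)
    ]


# Arbitrary test signal with L = 24 = 3 * 2**3, so k = 3 is not a power of two.
z = [float((7 * n) % 11) for n in range(24)]
err = max(abs(a - b) for a, b in zip(dft_composite(z, 3), dft_direct(z)))
print(err < 1e-9)
```

The final combination loop is written for clarity rather than speed; an efficient implementation would organise this step differently.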

In principle, the method for bit-rate saving encoding of
audio signals using a psychoacoustic model, wherein a
Fourier Transformation is performed for calculation of a
minimum masking threshold and wherein L samples of the
audio signal are arranged in a frame for transmission,
consists in the fact that the Fourier Transformation is
performed with a length of L samples by calculating k
subtransformations over 2^N samples with k*2^N = L and
fitting together the results of the k subtransformations.

In this way, the duplicate run per channel is avoided
without loss of information or introduction of errors.
Therefore, the required computing power is reduced by
nearly half. This is especially important for real-time
implementation on digital signal processors.

Although the inventive method is especially advantageous
if the number k of subtransformations is not a power of 2,
the use of the invention is not restricted to such values
of k.

Advantageously, before fitting together the results of the
k subtransformations, these are multiplied with phase
correction factors.

In an advantageous manner, the Fourier Transformation is
performed within the algorithm for the psychoacoustic
model 2 of MPEG 1 Audio Layer II, wherein the frame length
L is 1152 samples.

In an advantageous development, k=9 subtransformations with
a length of M=2^N=128 samples are calculated.
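
The values k=9 and M=128 can be obtained from the frame length by dividing out factors of two. The Python sketch below is a hedged illustration: choosing the largest possible power of two (and hence the smallest k) is an assumption made here, since the text only requires k*2^N = L, and the function name split_frame_length is hypothetical.

```python
def split_frame_length(L):
    """Return (k, N) with k * 2**N == L and k odd (largest power-of-two sublength)."""
    if L < 1:
        raise ValueError("L must be a positive integer")
    k, N = L, 0
    while k % 2 == 0:  # divide out factors of two until k is odd
        k //= 2
        N += 1
    return k, N


k, N = split_frame_length(1152)
print(k, 2 ** N)  # 9 128
```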



Drawing 






Exemplary embodiments of the invention are described with 
reference to the figure, which shows schematically the use 
of the subtransformations. 



Exemplary embodiments 

Although the encoder is not standardized in ISO/IEC 11172-3,
certain means for encoding, such as the estimation of
the masking threshold or quantization, are commonly used
and therefore are not described in detail in the
following. First of all, PCM audio samples with a sampling
frequency of 32, 44.1 or 48 kHz (or with half sample
frequencies of 16, 22.05 or 24 kHz) are fed into the
encoder. A mapping is performed to create a filtered and
subsampled representation of the input audio stream in the 
form of so-called subband samples. Each subband signal is 
quantized in such a way that the quantization noise 
introduced by the coding will not exceed the masking curve 
for that subband. In order to control the quantization, a
psychoacoustic model is used to calculate the new bit
allocation, for MPEG 1 Audio Layer II for three blocks
totaling 36 subband samples per subband, corresponding to
1152 input PCM samples. The calculation is based on the
Signal-to-Masking Ratios, SMR_n, for all the subbands, which
makes it necessary to determine for each subband the maximum
signal level and the minimum masking threshold. The minimum
masking threshold is derived by using the inventive method,
which is schematically shown in Fig. 1. After windowing
the input PCM samples, samples corresponding to a frame F1
are split and fed to nine subtransformations ST1, ..., ST9.
Each subtransformation has a length of 128 samples
(9*128 = 1152), which is the seventh power of 2 (2^7 = 128).
After Fast Fourier Transformations of the partial signals,
the results of the nine subtransformations are multiplied
with phase correction factors, symbolized by multiplier
units M1, ..., M9. The phase-corrected data are combined,
symbolized by adder unit A, and then used for the further
calculation of the psychoacoustic model. The same method
is applied to the following audio samples corresponding to
frames F2, F3 etc.
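
The relation between the two per-subband quantities named above and the Signal-to-Masking Ratio can be sketched as follows. This is a minimal illustration, assuming that both the maximum signal level and the minimum masking threshold are expressed in dB, so that the ratio becomes a difference. The function name and the numerical values are hypothetical.

```python
def signal_to_mask_ratios_db(max_signal_level_db, min_masking_threshold_db):
    """SMR_n per subband as the level difference in dB (assumes dB inputs)."""
    if len(max_signal_level_db) != len(min_masking_threshold_db):
        raise ValueError("one level and one threshold are needed per subband")
    return [s - t for s, t in zip(max_signal_level_db, min_masking_threshold_db)]


# Illustrative values for three subbands only; not taken from the standard.
levels = [62.0, 48.5, 30.0]
thresholds = [40.0, 45.0, 36.0]
print(signal_to_mask_ratios_db(levels, thresholds))  # [22.0, 3.5, -6.0]
```

A negative value, as in the third subband, indicates that the signal lies below the masking threshold in that subband.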

The invention can advantageously be used for encoding of
audio signals according to MPEG 1 Audio Layer II, but also
applies to the encoding of any other kind of digital data.

The invention can be implemented in any kind of encoder
which can be used e.g. for Digital Audio Broadcast (DAB),
cable and satellite radio/TV or digital recording devices
like DVD-VRs etc.

Claims

1. Method for bit-rate saving encoding of audio signals
using a psychoacoustic model, wherein a Fourier
Transformation is performed for calculation of a
minimum masking threshold and wherein L samples of the
audio signal are arranged in a frame (F1, F2, F3) for
transmission, characterized in that the Fourier
Transformation is performed with a length of L samples
by calculating k subtransformations (ST1, ..., ST9) over
2^N samples with k*2^N = L and fitting together (A) the
results of the k subtransformations.

2. Method according to claim 1, wherein the number k of
subtransformations is not a power of 2.

3. Method according to claim 1 or 2, wherein before
fitting together the results of the k
subtransformations, these are multiplied with phase
correction factors (M1, ..., M9).

4. Method according to any of claims 1 to 3, wherein the
Fourier Transformation is performed within the
algorithm for the psychoacoustic model 2 of MPEG 1
Audio Layer II and wherein the frame length L is 1152
samples.

5. Method according to claim 4, wherein k=9
subtransformations with a length of M=2^N=128 samples
are calculated.



6. Encoder for performing the method according to claim 1,

Abstract

MPEG 1 Audio data compression is based on subband coding.
A quantization is performed using a psychoacoustic model
which is adapted to the masking behaviour of the human
hearing. Each subband signal is quantized in such a way
that the quantization noise introduced by the coding will
not exceed the masking curve for that subband. In ISO/IEC
11172-3 two independent psychoacoustic models are defined.
The output from these psychoacoustic models is a set of
Signal-to-Masking Ratios, SMR_n, for every subband n. In
order to calculate the SMR_n for the psychoacoustic model 2
according to the invention, a Fast Fourier Transformation
(FFT) is performed with a length of L=1152 samples by
calculating k subtransformations (ST1, ..., ST9) over 2^N
samples with k*2^N = L and fitting together (A) the results
of the k subtransformations.